
26 research outputs found

Modeling multiple time units delayed gene regulatory network using dynamic Bayesian network.

Author: Xing Zhengzheng
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/2006

Effective Proxy for Human Labeling: Ensemble Disagreement Scores in Large Language Models for Industrial NLP

Author: Advani Laksh
Colak Aaron
Du Wei
Gambhir Yashmeet
Perry Daniel J
Shiralkar Prashant
Xing Zhengzheng
Publication date: 11/09/2023

Large language models (LLMs) have demonstrated significant capability to generalize across a large number of NLP tasks. For industry applications, it is imperative to assess the performance of the LLM on unlabeled production data from time to time to validate for a real-world setting. Human labeling to assess model error requires considerable expense and time delay. Here we demonstrate that ensemble disagreement scores work well as a proxy for human labeling for language models in zero-shot, few-shot, and fine-tuned settings, per our evaluation on a keyphrase extraction (KPE) task. We measure fidelity of the results by comparing to true error measured from human labeled ground truth. We contrast with the alternative of using another LLM as a source of machine labels, or silver labels. Results across various languages and domains show disagreement scores provide a better estimation of model performance with mean average error (MAE) as low as 0.4% and on average 13.8% better than using silver labels.

arXiv.org e-Print Archive
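
The idea of scoring disagreement among models on unlabeled data, as described in the abstract above, can be illustrated with a small sketch. This is not the paper's scoring function: the names `jaccard_distance` and `disagreement_score`, the use of Jaccard distance between predicted keyphrase sets, and the plain averaging are all assumptions chosen for illustration.

```python
from typing import List, Set


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """Return 1 - |a & b| / |a | b|; two empty sets count as full agreement."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def disagreement_score(primary: List[Set[str]],
                       ensemble: List[List[Set[str]]]) -> float:
    """Average distance between a primary model and each ensemble member.

    primary[i] is the keyphrase set the primary model predicts for example i;
    ensemble[k][i] is the prediction of ensemble member k for the same example.
    (Hypothetical data layout, assumed for this sketch.)
    """
    if not primary or not ensemble:
        raise ValueError("need at least one example and one ensemble member")
    total = 0.0
    count = 0
    for member in ensemble:
        if len(member) != len(primary):
            raise ValueError("every member must predict for every example")
        for p, m in zip(primary, member):
            total += jaccard_distance(p, m)
            count += 1
    return total / count


if __name__ == "__main__":
    primary = [{"neural network", "training"}, {"billing"}]
    ensemble = [
        [{"neural network", "training"}, {"billing"}],  # agrees fully
        [{"neural network"}, {"refund"}],               # partly disagrees
    ]
    print(disagreement_score(primary, ensemble))
```

A higher score means the models disagree more on the unlabeled examples, which under this sketch's assumptions would be read as a signal of higher expected error; no human labels are needed to compute it.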

Early classification on temporal sequences

Author: Xing Zhengzheng
Publication date: 08/10/2010

Early classification of temporal sequences has applications in, for example, health informatics, intrusion detection, anomaly detection, and scientific and engineering sequence data monitoring. Compared with learning conventional sequence classifiers, learning early classifiers is a more challenging task and has not been systematically studied before. In this work, we identify the problem of early classification and develop a series of classifiers for temporal sequence early classification. The proposed classifiers are designed for different types of temporal sequences including symbolic sequences and time series. Furthermore, the proposed classifiers have several desirable characteristics which are useful in different application scenarios. We evaluate our approaches on a broad range of real data sets and demonstrate that the classifiers can achieve competitive classification accuracies with great earliness. Also, the classifiers can extract interpretable features from sequences for better understanding.

Simon Fraser University Institutional Repository

Consensus Formation Control and Obstacle Avoidance of Multiagent Systems with Directed Topology

Author: Bingyou Liu
Lichao Wang
Xing Li
Zhengzheng Zhang
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2020

This study addresses the problems of formation control and obstacle avoidance for a class of second-order multiagent systems with directed topology. Formation and velocity control laws are designed to solve the formation tracking problem. A new obstacle avoidance control law is also proposed. The consensus control protocol then combines the formation, velocity, and obstacle avoidance control laws. The convergence of the proposed control protocol is analyzed by a redesigned Lyapunov function. Finally, the effectiveness of theoretical results is illustrated by simulation examples. The simulation results show that the formation tracking problem of the given multiagent systems can be realized and obstacles can be avoided under the proposed control protocol.

Directory of Open Access Journals

Mining sequence classifiers for early prediction

Author: Guozhu Dong
Jian Pei
Philip S. Yu
Zhengzheng Xing
Publication date: 01/01/2008

Supervised learning on sequence data, also known as sequence classification, has been well recognized as an important data mining task with many significant applications. Since temporal order is important in sequence data, in many critical applications of sequence classification such as medical diagnosis and disaster prediction, early prediction is a highly desirable feature of sequence classifiers. In early prediction, a sequence classifier should use a prefix of a sequence as short as possible to make a reasonably accurate prediction. To the best of our knowledge, early prediction on sequence data has not been studied systematically. In this paper, we identify the novel problem of mining sequence classifiers for early prediction. We analyze the problem and the challenges. As the first attempt to tackle the problem, we propose two interesting methods. The sequential classification rule (SCR) method mines a set of sequential classification rules as a classifier. A so-called early-prediction utility is defined and used to select features and rules. The generalized sequential decision tree (GSDT) method adopts a divide-and-conquer strategy to generate a classification model. We conduct an extensive empirical evaluation on several real data sets. Interestingly, our two methods achieve accuracy comparable to that of the state-of-the-art methods, but typically need to use only very short prefixes of the sequences. The results clearly indicate that early prediction is highly feasible and effective.

CiteSeerX
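
The notion of predicting from the shortest possible prefix, as described in the abstract above, can be shown with a toy sketch. It is not the SCR or GSDT method and involves no rule mining or early-prediction utility: the rules are hand-written, and the names `Rule`, `is_subsequence`, and `predict_early` are hypothetical, introduced only for illustration.

```python
from typing import List, Optional, Sequence, Tuple

# A rule pairs a symbol pattern with a class label (assumed representation).
Rule = Tuple[Tuple[str, ...], str]


def is_subsequence(pattern: Sequence[str], prefix: Sequence[str]) -> bool:
    """True if the pattern's symbols occur in the prefix in order (gaps allowed)."""
    it = iter(prefix)
    # `symbol in it` advances the iterator, so order is respected.
    return all(symbol in it for symbol in pattern)


def predict_early(sequence: Sequence[str],
                  rules: List[Rule]) -> Tuple[Optional[str], int]:
    """Read the sequence one symbol at a time and stop at the first matching rule.

    Returns (label, prefix_length_used). If no rule ever matches, the label is
    None and the whole sequence has been consumed.
    """
    for length in range(1, len(sequence) + 1):
        prefix = sequence[:length]
        for pattern, label in rules:
            if is_subsequence(pattern, prefix):
                return label, length
    return None, len(sequence)


if __name__ == "__main__":
    rules: List[Rule] = [(("a", "c"), "pos"), (("b", "b"), "neg")]
    label, used = predict_early(list("abcb"), rules)
    print(label, used)
```

The second return value is the point of the exercise: it reports how many symbols had to be observed before a prediction was made, which is the quantity an early classifier tries to keep small while staying accurate.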
